Abstract. In multiobjective combinatorial optimization, there exist
two main classes of metaheuristics, based either on multiple aggregations
or on a dominance relation. As in the single-objective case, the
structure of the search space can explain the difficulty encountered by
multiobjective metaheuristics and guide the design of such methods. In this
work, we analyze the properties of multiobjective combinatorial search spaces.
In particular, we focus on the features related to the efficient set, and we pay
particular attention to the correlation between objectives. Few benchmarks
take such objective correlation into account. Here, we define a
general method to design multiobjective problems with correlation. As
an example, we extend the well-known multiobjective NK-landscapes.
By measuring different properties of the search space, we show the
importance of considering the objective correlation in the design of
metaheuristics.



1 Introduction 

Multiobjective combinatorial optimization (MoCO) problems, where several cri- 
teria have to be optimized simultaneously, receive more and more interest in the 
field of search algorithms. One of the main issues in multiobjective optimization 
is the Pareto dominance relation, which gives a partial order between feasible 
solutions. Roughly speaking, a given solution dominates another solution if it 
is better according to all objective functions. A possible approach in solving a 
multiobjective problem consists in finding the whole set of non-dominated so- 
lutions, called the efficient set, or a subset that is close to it. This efficient set 
plays a central role in the structure of the search space. 

The design of metaheuristics for multiobjective combinatorial optimization
is a real challenge, as it is problem-dependent. As in single-objective
optimization, the structure of the search space can explain the performance of
multiobjective metaheuristics. Two main classes of multiobjective metaheuristics
can be distinguished. The first class, known as scalar approaches, is based on
multiple scalarized aggregations of the objective functions. However, these
approaches can only find a subset of efficient solutions, called supported
efficient solutions. The second class, known as Pareto-based approaches,
directly or indirectly focuses the search on the Pareto dominance relation.
Moreover, when the efficient set is too large, a metaheuristic should manipulate
a limited-size solution set during the search, and this limit is related to the
size of the efficient set. In addition, connectedness is the property that
efficient solutions are connected with respect to a neighborhood relation [1].
When connectedness holds, it becomes possible to find the whole efficient set by
iteratively exploring the neighborhood of the current approximation, initialized
with at least one efficient solution. This strategy is often used, explicitly or
implicitly, by Pareto-based approaches. For the design of metaheuristics for
MoCO, three main questions related to the properties of the efficient set are
of interest in this paper:

(i) What is the cardinality of the efficient set? Can we expect to identify or
approximate the whole set of efficient solutions, or should we consider a
mechanism to bound the size of the approximation set?

(ii) How many efficient solutions are supported? Is a scalar approach able to
identify or approximate enough efficient solutions?

(iii) Are efficient solutions connected with respect to a neighborhood operator?
Is it possible to identify or approximate additional efficient solutions by a
simple local search initialized with a subset of the efficient set?

In particular, we want to study such properties according to the objective
correlation, as it seems to largely affect the solutions of MoCO problems [2] and the
behavior of metaheuristics [3]. Few benchmarks take the correlation between
objectives into account. To the best of our knowledge, the multiobjective quadratic
assignment problem [3] is the only one. In this problem, a parameter
tunes the correlation between different pairs of objectives. Another well-known
benchmark, the multiobjective NK-landscapes [5], facilitates the study of
problem structure in multiobjective optimization. In this class, the epistatic degree,
which is the degree of non-linearity of the problem, can be tuned very precisely.
In this work, in order to study the problem structure, and in particular the
structure of the efficient set, we define a general method to tune the correlation
between all pairs of objectives very precisely. As an example, we define the
$\rho$MNK-landscapes, an extension of multiobjective NK-landscapes
with objective correlation. With such a benchmark, we can study the problem
structure according to the objective space dimension, the epistasis and especially
the objective correlation, and then highlight some guidelines for the design of
efficient multiobjective metaheuristics.

In summary, the contributions of this work can be stated as follows. First, we
propose a method to precisely tune the correlation between objective functions.
It is applied to the design of MNK-landscapes, but it can easily be generalized
to other problems. Second, we show the influence of the objective correlation
on some properties of the efficient set (and its image in the objective space):
its size, the proportion of supported solutions, and the connectedness of
efficient solutions. Third, we relate those properties to the design of local search
metaheuristics in order to help the practitioner make proper choices between
several classes of methodologies. The remainder of the paper is organized as
follows. Section 2 is dedicated to multiobjective combinatorial optimization,
multiobjective metaheuristics, as well as single- and multi-objective NK-landscapes.
Section 3 presents the design of $\rho$MNK-landscapes. We conduct a theoretical
analysis and an experimental study to show the sharpness of the objective
correlation. Section 4 analyzes in depth the efficient set structure on this new class of
problems according to the objective space dimension, the non-linearity and
especially the objective correlation. The consequences for the design of multiobjective
metaheuristics are discussed in the last section.

2 Background 

2.1 Multiobjective Combinatorial Optimization 

A large number of real-world optimization problems are multiobjective by
nature, because several criteria have to be considered simultaneously. A MoCO
problem can be defined by a set of $M \geq 2$ objective functions $(f_1, f_2, \dots, f_M)$
and a discrete set $X$ of feasible solutions in the decision space. Let $Z = f(X) \subseteq \mathbb{R}^M$
be the set of feasible outcome vectors in the objective space. In a
maximization context, a solution $x \in X$ dominates a solution $x' \in X$, denoted by
$x \succ x'$, iff $\forall i \in \{1, 2, \dots, M\}$, $f_i(x) \geq f_i(x')$ and $\exists j \in \{1, 2, \dots, M\}$ such that
$f_j(x) > f_j(x')$. A solution $x \in X$ is said to be efficient (or non-dominated,
Pareto optimal) if there does not exist any other solution $x' \in X$ such that $x'$
dominates $x$. The set of all efficient solutions is called the efficient set (or Pareto
optimal set), denoted by $X_E$, and its mapping in the objective space is called the
Pareto front. A possible approach in MoCO is to identify a minimal complete
efficient set, i.e. one efficient solution mapping to each point of the Pareto front.
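
The definitions above can be illustrated by a brute-force sketch. It is only meant for tiny, fully enumerated instances; the function names and the toy objective vectors are illustrative assumptions, not part of the original study.

```python
from typing import Dict, List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if objective vector a Pareto-dominates b (maximization)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def efficient_set(outcomes: Dict[str, Tuple[float, ...]]) -> List[str]:
    """Solutions whose objective vector is dominated by no other solution."""
    return [s for s, fs in outcomes.items()
            if not any(dominates(ft, fs)
                       for t, ft in outcomes.items() if t != s)]


# Hypothetical toy instance: solution name -> (f_1, f_2).
toy = {"a": (1, 5), "b": (3, 3), "c": (2, 2), "d": (5, 1), "e": (3, 3)}
print(efficient_set(toy))  # ['a', 'b', 'd', 'e']
```

Solutions "b" and "e" map to the same point of the Pareto front, so a minimal complete efficient set would keep only one of them.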

However, generating the entire efficient set of a MoCO problem is often in- 
feasible for two main reasons [S]. First, for most MoCO problems, the number of 
efficient solutions is known to be exponential in the size of the problem instance. 
In that sense, most MoCO problems are said to be intractable. Second, deciding
if a feasible solution belongs to the efficient set is NP-complete for numerous
MoCO problems, even if none of their single-objective counterparts is NP-hard.
Therefore, the overall goal is often to identify a good efficient set approximation. 
To this end, metaheuristics in general, and evolutionary algorithms in particu- 
lar, have received a growing interest since the late eighties, and multiobjective 
metaheuristics still constitute an active research area. 

2.2 Metaheuristics for Multiobjective Combinatorial Optimization 

Two main classes of metaheuristics for MoCO can be distinguished, see for in- 
stance [7]. The first ones, known as scalar approaches, are based on multiple 
scalarized aggregations of the objective functions. The second ones, known as 
Pareto-based approaches, directly or indirectly focus the search on the Pareto 
dominance relation (or a slight modification of it). These two kinds of approaches 
can also be hybridized in a two-phase way. 



Initial approaches dealing with MoCO are based on successive transformations
of the original multiobjective problem into single-objective ones by means
of a scalarization strategy. Most of the time, scalar approaches are based on
a weighted-sum aggregation of the objective functions, which can be defined as
follows. $\forall x \in X$: $f_\lambda(x) = \sum_{i=1}^{M} \lambda_i f_i(x)$, where $\lambda_i > 0$ for all $i \in \{1, \dots, M\}$.
The problem is now to identify a (single) solution that maximizes $f_\lambda$. For any
given weighting coefficient vector $\lambda$, if $x^* = \arg\max_{x \in X} f_\lambda(x)$, then $x^*$ is an
efficient solution. Multiple weighting coefficient vectors can be iteratively defined so
that several non-dominated solutions are identified (or approximated). For each
scalarization, the corresponding solution is incorporated into an approximation
set, whose dominated solutions are then discarded. However, in the combinatorial
case, a number of efficient solutions are not optimal for any definition of $f_\lambda$.
They are known as non-supported (efficient) solutions. On the contrary, there
exist supported (efficient) solutions, whose corresponding objective vectors are
located on the convex hull of the Pareto front. The set of all supported efficient
solutions will be denoted by $X_{SE}$. As a consequence, the proportion of
non-supported solutions over the efficient set has a direct implication on the ability
of scalar approaches to find a proper non-dominated set approximation.
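
A minimal sketch of why a weighted-sum scan can miss efficient solutions is given below. The three objective vectors are hypothetical and restricted to two objectives; solution "b" is efficient but lies below the segment joining "a" and "c", so no strictly positive weight vector selects it.

```python
from typing import Dict, Sequence, Set, Tuple


def weighted_sum_argmax(outcomes: Dict[str, Tuple[float, ...]],
                        weights: Sequence[float]) -> str:
    """Solution maximizing the weighted sum of its objective values."""
    return max(outcomes,
               key=lambda s: sum(w * f for w, f in zip(weights, outcomes[s])))


def scan_weights(outcomes: Dict[str, Tuple[float, float]],
                 steps: int = 100) -> Set[str]:
    """Solutions found with weights (w, 1 - w), w = 1/steps, ..., (steps-1)/steps."""
    found = set()
    for k in range(1, steps):  # strictly positive weights only
        w = k / steps
        found.add(weighted_sum_argmax(outcomes, (w, 1.0 - w)))
    return found


# Hypothetical bi-objective instance (maximization); all three are efficient.
toy = {"a": (0, 10), "b": (4, 4), "c": (10, 0)}
print(sorted(scan_weights(toy)))  # ['a', 'c']: "b" is non-supported
```

For any weight $w$, the best of "a" and "c" scores at least 5, whereas "b" always scores 4, which is why it is never returned.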

Over the years, other types of approaches were proposed. They are based
on the explicit or implicit use of the Pareto dominance relation, which
defines a partial order between feasible solutions. The basic idea is to maintain
a set of solutions (typically a population or an archive of mutually non-dominated
solutions). The content of this set is then iteratively updated with new solutions
built by means of variation or neighborhood operators. The update of this set is
based on a specific decision on which solutions to accept or to choose for further
manipulation. This process is iterated until no further improvement is possible
or another stopping condition is fulfilled. In the end, this set corresponds to the
approximation returned by the algorithm. The implicit goal is to identify an
approximation whose image in the objective space is (i) close to and (ii) well
spread along the Pareto front. However, as the number of efficient solutions is
often intractable, we generally have to design specific strategies to limit the size
of the approximation set [8]. As a consequence, the cardinality of the efficient
set also plays a major role in the design of multiobjective metaheuristics.
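
As a rough illustration of the update step described above, the following sketch maintains an unbounded archive of mutually non-dominated objective vectors. The function names are assumptions made for the example, and the size-limiting strategies mentioned in the text are deliberately not shown.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if objective vector a Pareto-dominates b (maximization)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def update_archive(archive: List[Vector], candidate: Vector) -> List[Vector]:
    """Return the archive after trying to insert one candidate vector."""
    # Reject the candidate if it is dominated or already present.
    if any(dominates(a, candidate) or a == candidate for a in archive):
        return archive
    # Otherwise drop the members it dominates and add it.
    return [a for a in archive if not dominates(candidate, a)] + [candidate]


archive: List[Vector] = []
for point in [(1, 5), (3, 3), (2, 2), (5, 1), (4, 4)]:
    archive = update_archive(archive, point)
print(archive)  # [(1, 5), (5, 1), (4, 4)]
```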

More recently, the neighborhood structure of the efficient set has been claimed
to play a crucial role for the development of efficient metaheuristics. One of these
properties is known as connectedness [1,9]. Let us define a graph such that each
node represents an efficient solution, and an edge connects a pair of nodes if
the corresponding solutions are neighbors with respect to a given neighborhood
operator [1]. This graph is called the efficient graph. A neighborhood operator
is a function $\mathcal{N} : X \to 2^X$ that assigns a set of solutions $\mathcal{N}(x) \subset X$ to any
solution $x \in X$. $\mathcal{N}(x)$ is called the neighborhood of $x$, and a solution $x' \in \mathcal{N}(x)$
is called a neighbor of $x$. The efficient set is said to be connected if there exists
a path between every pair of nodes in the graph. In particular, each efficient
solution is then located in the neighborhood of at least one other solution from the
efficient set. This property has later been extended to the notion of cluster by
introducing an arbitrary distance separating two efficient solutions [10]. When
connectedness holds, it becomes possible to find all the efficient solutions by
iteratively exploring the neighborhood of the current approximation,
starting with one (or more) solution(s) from the efficient set. This
gives rise to a two-phase approach: (i) identify a number of (typically supported)
non-dominated solutions; (ii) improve the set of non-dominated solutions by
exploring their neighborhood.
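
The efficient graph can be explored with a simple traversal. The sketch below assumes that solutions are bit strings of equal length and that two solutions are neighbors when their Hamming distance is at most `k` (a hypothetical parameter name, with `k = 1` corresponding to a one-bit-flip neighborhood).

```python
from typing import List, Set


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def connected_components(solutions: List[str], k: int = 1) -> List[Set[str]]:
    """Connected components of the graph linking solutions at distance <= k."""
    remaining = set(solutions)
    components = []
    while remaining:
        seed = remaining.pop()
        component, frontier = {seed}, [seed]
        while frontier:
            current = frontier.pop()
            neighbors = {s for s in remaining if hamming(current, s) <= k}
            remaining -= neighbors
            component |= neighbors
            frontier.extend(neighbors)
        components.append(component)
    return components


# Hypothetical efficient set on 4 bits.
efficient = ["0000", "0001", "0011", "1100"]
print(sorted(len(c) for c in connected_components(efficient)))  # [1, 3]
```

With this toy set, the efficient graph is not connected at distance 1, but it becomes connected when neighbors at distance 2 are allowed.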

2.3 NK- and MNK-Landscapes

The family of NK-landscapes [11] is a problem-independent model used for
constructing multimodal landscapes. $N$ refers to the number of (binary) genes in
the genotype (i.e. the string length) and $K$ to the number of genes that influence
a particular gene from the string (the epistatic interactions). By increasing the
value of $K$ from $0$ to $(N - 1)$, NK-landscapes can be gradually tuned from
smooth to rugged. The fitness function (to be maximized) of an NK-landscape
$f_{NK} : \{0, 1\}^N \to [0, 1)$ is defined on binary strings with $N$ bits. An 'atom' with
fixed epistasis level is represented by a fitness component $f_i : \{0, 1\}^{K+1} \to [0, 1)$
associated with each bit $i \in \{1, \dots, N\}$. Its value depends on the allele at bit $i$ and also
on the alleles at $K$ other epistatic positions ($K$ must fall between $0$ and $N - 1$).
The fitness $f_{NK}(x)$ of a solution $x \in \{0, 1\}^N$ corresponds to the mean value of
its $N$ fitness components $f_i$:

$$f_{NK}(x) = \frac{1}{N} \sum_{i=1}^{N} f_i(x_i, x_{i_1}, \dots, x_{i_K})$$

where $\{i_1, \dots, i_K\} \subset \{1, \dots, i-1, i+1, \dots, N\}$. Several ways have been proposed
to set the $K$ bits from the bit string of size $N$. Two possibilities are mainly used:
adjacent and random neighborhoods. With an adjacent neighborhood, the $K$
bits nearest to the bit $i$ are chosen (the genotype is taken to have periodic
boundaries). With a random neighborhood, the $K$ bits are chosen randomly on
the bit string. Each fitness component $f_i$ is specified by extension, i.e. a number
$y_{x_i, x_{i_1}, \dots, x_{i_K}}$ from $[0, 1)$ is associated with each element $(x_i, x_{i_1}, \dots, x_{i_K})$ from
$\{0, 1\}^{K+1}$. Those numbers are uniformly distributed in the range $[0, 1)$.
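
A single-objective NK-landscape with a random neighborhood can be sketched as follows. This is a minimal illustration of the definition above rather than the generator used in the paper; the function name, the seeding scheme and the 0-based bit indexing are assumptions.

```python
import random
from itertools import product
from typing import Callable, Sequence


def make_nk_landscape(n: int, k: int, seed: int = 0) -> Callable[[Sequence[int]], float]:
    """Build a random NK fitness function on bit strings of length n."""
    rng = random.Random(seed)
    # Random neighborhood: K other positions influence each bit i.
    links = [rng.sample([j for j in range(n) if j != i], k) for i in range(n)]
    # Each component is given by extension: one number in [0, 1)
    # for each of the 2^(K+1) possible bit patterns.
    tables = [{bits: rng.random() for bits in product((0, 1), repeat=k + 1)}
              for _ in range(n)]

    def fitness(x: Sequence[int]) -> float:
        total = 0.0
        for i in range(n):
            key = (x[i],) + tuple(x[j] for j in links[i])
            total += tables[i][key]
        return total / n  # mean of the N fitness components

    return fitness


f = make_nk_landscape(n=8, k=2, seed=42)
print(f((0, 1, 1, 0, 1, 0, 0, 1)))
```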

More recently, a multiobjective variant of NK-landscapes (namely
MNK-landscapes) [5] has been defined with a set of $M$ fitness functions:

$$f_{mNK}(x) = \frac{1}{N} \sum_{i=1}^{N} f_{m,i}(x_i, x_{i_{m,1}}, \dots, x_{i_{m,K_m}}), \quad m \in \{1, \dots, M\}$$

The numbers of epistasis links $K_m$ can theoretically be different for each fitness
function. But in practice, the same epistasis degree $K_m = K$ for all $m \in \{1, \dots, M\}$
is used. Each fitness component $f_{m,i}$ is specified by extension with the numbers
$y^{m,i}_{x_i, x_{i_{m,1}}, \dots, x_{i_{m,K_m}}}$. In the original MNK-landscapes [5], these numbers are
randomly and independently drawn from $[0, 1)$. As a consequence, it is very unlikely
that two different solutions map to the same point in the objective space.



3 $\rho$MNK-Landscapes: Multiobjective NK-Landscapes
with Correlation



In this section, we define the CMNK- and the $\rho$MNK-landscapes, which are
based on the MNK-landscapes [5]. In this multiobjective model, the correlation
between objective functions can be precisely tuned by a correlation matrix. It
makes it possible to study the simultaneous influence of objective space dimension,
non-linearity and objective correlation on the main properties of multiobjective
fitness landscapes. The construction of the landscapes is defined, and an analytic proof
of the correlation between objectives, complemented by an experimental study, is
given. Note that the proposed approach to tune the objective correlation can be
applied to other MoCO problems where the objective functions are additive,
share the same definition, but are computed with different cost or
profit matrices. This is the case, for instance, of the multiobjective knapsack,
traveling salesman and quadratic assignment problems [4,6].

3.1 Definition 

In the proposed CMNK-landscapes, the epistasis structure is identical for all
the objective functions: $\forall m \in \{1, \dots, M\}$, $K_m = K$ and $\forall m \in \{1, \dots, M\}$, $\forall j \in \{1, \dots, K\}$,
$i_{m,j} = i_j$. The fitness components are not defined independently. The
numbers $(y^{1,i}_{x_i, x_{i_1}, \dots, x_{i_K}}, \dots, y^{M,i}_{x_i, x_{i_1}, \dots, x_{i_K}})$ follow a multivariate uniform law of
dimension $M$, defined by a correlation matrix $C$. Thus, the $y$'s follow a
multidimensional law with uniform marginals, and the correlations between the $y^{m,i}$'s are defined
by the matrix $C$. So, the four parameters of the family of CMNK-landscapes
are (i) the number of objective functions $M$, (ii) the length of the bit string $N$,
(iii) the number of epistatic links $K$, and (iv) the correlation matrix $C$.

The matrix $C$ is a symmetric positive-definite matrix where $\frac{M(M-1)}{2}$ numbers
can be defined. In order to limit the number of free numbers in matrix $C$, we
define the matrix $C_\rho = (c_{np})$, which has the same correlation between all the
objectives: $c_{nn} = 1$ for all $n$, and $c_{np} = \rho$ for all $n \neq p$. In this case, we denote
CMNK-landscapes by $\rho$MNK-landscapes, and the original MNK-landscapes
are equivalent to $\rho$MNK-landscapes with $\rho = 0$. However, it is not possible
to have the matrix $C_\rho$ for all $\rho$ in $[-1, 1]$. $C_\rho$ must be positive-definite:
$\forall u \in \mathbb{R}^M \setminus \{0\}$, $u^T C_\rho u > 0$. So, $\rho$ must be greater than $\frac{-1}{M-1}$. For two-objective
problems, all the correlations in $[-1, 1]$ are possible. However, for
three-objective problems, the correlation $\rho$ must fall in $[-0.5, 1]$. Of course, if one
wants to study very negative correlations between some pairs of objectives, it is
possible to design a matrix $C$ that keeps the condition that $C$ is positive-definite.
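
The lower bound on the common correlation can be recovered from the eigenvalues of the equicorrelation matrix. The short derivation below is a standard linear-algebra argument added for illustration; $I_M$ denotes the identity matrix and $\mathbf{1}$ the all-ones vector of dimension $M$ (notation introduced here, not in the original text).

```latex
C_\rho = (1-\rho)\, I_M + \rho\, \mathbf{1}\mathbf{1}^{T}
\quad\Longrightarrow\quad
\begin{cases}
C_\rho \mathbf{1} = \bigl(1 + (M-1)\rho\bigr)\,\mathbf{1}, \\[2pt]
C_\rho u = (1-\rho)\, u \quad \text{for all } u \perp \mathbf{1}.
\end{cases}
\qquad
C_\rho \succ 0
\iff
1 + (M-1)\rho > 0 \ \text{and}\ 1 - \rho > 0
\iff
\frac{-1}{M-1} < \rho < 1 .
```

This gives $\rho > -1$ for $M = 2$, $\rho > -0.5$ for $M = 3$ and $\rho > -0.25$ for $M = 5$. The value $\rho = 1$ is the degenerate limit in which the matrix is only positive semi-definite.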

To generate random variables with uniform marginals and a specified
correlation matrix $C$, we follow the work of Hotelling [12]. We first generate $(Z_1, \dots, Z_M)$
following a multinormal law with zero means and correlation matrix $R = 2 \sin(\frac{\pi}{6} C)$. Then, the
values $U_i = \Phi(Z_i)$ are uniformly distributed with a correlation matrix $C$, where
$\Phi$ is the univariate normal cumulative distribution function. Note that this is not the
only way to generate a multivariate uniform law.
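
The sampling procedure described above can be sketched with the Python standard library only. The helper names are illustrative assumptions, the sine is applied entry by entry, and the sketch assumes that the transformed matrix $R$ remains positive-definite (otherwise the Cholesky factorization raises an error).

```python
import math
import random
from statistics import NormalDist
from typing import List, Sequence, Tuple


def cholesky(a: Sequence[Sequence[float]]) -> List[List[float]]:
    """Lower-triangular factor L such that L L^T = a (a positive-definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def correlated_uniforms(c: Sequence[Sequence[float]], count: int,
                        seed: int = 0) -> List[Tuple[float, ...]]:
    """Draw `count` vectors with uniform marginals and correlation matrix c."""
    rng = random.Random(seed)
    m = len(c)
    # R = 2 sin(pi/6 * C), entry by entry (the diagonal stays equal to 1).
    r = [[2.0 * math.sin(math.pi / 6.0 * c[i][j]) for j in range(m)]
         for i in range(m)]
    low = cholesky(r)
    phi = NormalDist().cdf  # standard normal cumulative distribution function
    samples = []
    for _ in range(count):
        g = [rng.gauss(0.0, 1.0) for _ in range(m)]
        z = [sum(low[i][k] * g[k] for k in range(i + 1)) for i in range(m)]
        samples.append(tuple(phi(v) for v in z))
    return samples


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


draws = correlated_uniforms([[1.0, 0.7], [0.7, 1.0]], count=20000, seed=1)
print(pearson([d[0] for d in draws], [d[1] for d in draws]))  # close to 0.7
```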



3.2 Correlation between Objective Functions 



The construction of CMNK-landscapes defines correlation between the $y$'s but
not directly between the objectives. In this section, we prove by algebra that the
correlation between objectives is tuned by the matrix $C$. This proof is followed
by an experimental analysis.

Theoretical analysis. Let $F_m = (f_{mNK}(x))$ be the vector of fitness values of the
$2^N$ solutions with respect to objective $m$. The correlation between objectives $n$
and $p$ is: $cor(F_n, F_p) = \frac{cov(F_n, F_p)}{\sigma_n \sigma_p}$, where $\sigma_n$ and $\sigma_p$ are the standard deviations
of fitness values over the landscape of the $n$-th and $p$-th NK fitness functions. $F_n$
(resp. $F_p$) corresponds to the average value of the $N$ vectors $F_{ni}$ (resp. $F_{pj}$) of
fitness component values:

$$cov(F_n, F_p) = \frac{1}{N^2} \sum_{i,j=1}^{N} cov(F_{ni}, F_{pj})$$

By definition, when $i \neq j$, $cov(F_{ni}, F_{pj}) = 0$ and $cov(F_{ni}, F_{pi}) = c_{np} \cdot \sigma_{ni} \cdot \sigma_{pi}$,
where $c_{np}$ is the correlation defined in the matrix $C$, and $\sigma_{ni}$ (resp. $\sigma_{pi}$) is the
standard deviation of fitness component $i$. The correlation between objectives $n$
and $p$ becomes:

$$cor(F_n, F_p) = c_{np} \, \frac{\sum_{i=1}^{N} \sigma_{ni} \sigma_{pi}}{N^2 \, \sigma_n \sigma_p}$$

By construction of the fitness functions, the following relation between standard
deviations holds: $\sigma_n^2 = \frac{1}{N^2} \sum_{i=1}^{N} \sigma_{ni}^2$ (resp. for $\sigma_p$). On average, the $\sigma_{ni}$ are equal
to the standard deviation of the uniform law on $[0, 1)$. Replacing every $\sigma_{ni}$ and $\sigma_{pi}$
by this common value $\sigma$ gives $\sum_{i=1}^{N} \sigma_{ni} \sigma_{pi} = N \sigma^2$ and $N^2 \sigma_n \sigma_p = N \sigma^2$, so that the
ratio above equals 1 and:

$$E(cor(F_n, F_p)) \approx c_{np} \qquad (1)$$

Then, the average of the correlations between objective functions is given by
the matrix $C$. In the $\rho$MNK-landscapes, the parameter $\rho$ makes it possible to tune very
precisely the correlation between all pairs of objectives.

Experimental study. In order to enumerate the search space exhaustively, we
conduct an empirical study for $N = 18$. In order to minimize the influence of
the random creation of landscapes, we considered 30 different and independent
landscapes for each parameter combination: $\rho$, $M$, $N$ and $K$. The measures
reported are the average over these 30 landscapes. The remaining set of parameters
is given in Table 1. Figure 1 shows the average¹ of the Spearman correlation
coefficient according to the parameters $\rho$, $M$ and $K$. This confirms the result of
equation (1): the correlation coefficients are very close to the expected value $\rho$.
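
For reference, the Spearman coefficient used above is the Pearson correlation computed on ranks. The sketch below uses the classical closed form, which is only valid when there are no ties; this is a reasonable assumption here since fitness values are real numbers that rarely coincide. The function names are illustrative.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Rank of each value (0 = smallest), assuming all values are distinct."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def spearman(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Spearman rank correlation of two tie-free samples of equal length."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


print(spearman([1, 2, 3, 4], [4, 3, 2, 1]))  # -1.0
```

In an exhaustive study such as this one, `xs` and `ys` would hold the values of two objective functions over all enumerated solutions.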

Thus, in the $\rho$MNK-landscapes, the parameter $\rho$ tunes the correlation very
precisely and, in addition to the correlated multiobjective quadratic
assignment problem [4], it is possible to tune this correlation between all pairs of


¹ For $M > 2$, there are several correlation coefficients. We report here the average
correlation coefficient over all pairs of objectives (these values are all very close).



Table 1. Parameters used in the paper for the experimental analysis.

Parameter   Values
N           18
M           {2, 3, 5}
K           {2, 4, 6, 8, 10}
$\rho$      {-0.9, -0.7, -0.4, -0.2, 0.0, 0.2, 0.4, 0.7, 0.9} such that $\rho > \frac{-1}{M-1}$


Fig. 1. Average values of the correlation between objectives according to the
parameter $\rho$. The number of objectives is $M = 2$ (left) and $M = 5$ (right).



objectives. In the following, we study the influence of the epistasis, the number of
objectives and the objective correlation on the properties of the efficient set for the
$\rho$MNK-landscapes model.

4 Analysis of the Efficient Set Properties 

In this section, we conduct experiments on the $\rho$MNK-landscapes in order to
study different properties of the efficient set: its cardinality, the number of
supported solutions and connectedness-related features. The instances under study
are defined by the parameter setting given in Table 1.

4.1 Cardinality of the Efficient Set 

Figure 2 shows the proportion of efficient solutions in the search space according
to parameters $K$, $\rho$ and $M$ of $\rho$MNK-landscapes. First of all, the epistatic
parameter $K$ does not seem to have a major influence on the results. In
contrast, the objective correlation $\rho$ changes the number of efficient solutions by
several orders of magnitude. Indeed, the proportion decreases from 10~^ to 10~^ 
{p 6 [—1, 1]) for two-objective problems, and from 10"-^ to 10~^ {p 6 [—0.2, 1]) 
for M = 5. With respect to the number of objective functions ($M = 2$, $3$, and $5$),
the size increases by several orders of magnitude with $M$. For a negative objective



correlation (p = —0.2), the proportion goes from lO^"* to 10^"'^ whereas it goes 
from 10^^ to 10^^ for a positive correlation (p = 0.9). 

The influence of the objective correlation on the size of the efficient set becomes
as important as that of the number of dimensions of the objective space. Many solutions
become efficient when the anti-correlation is high. Now, let us suppose that we want to set
or to bound the size of the approximation set to 100. Such a parameter setting is
often used when handling a population or an archive of non-dominated solutions
in a multiobjective metaheuristic. For the $\rho$MNK-landscapes with $N = 18$, the proportion of
non-dominated solutions over the search space should then be roughly $4 \cdot 10^{-4}$
(this goes up to $8 \cdot 10^{-4}$ for 200 solutions). Whatever the correlation value $\rho$,
a 100-solution approximation set is always able to store the whole efficient set for
two-objective problems. However, this is not the case for a higher dimension of
the objective space. For instance, for $M = 5$, 100 solutions suffice to store the
whole efficient set for a high objective correlation only ($\rho > 0.5$). In other words,
for $\rho < 0.5$, we cannot expect to identify the whole efficient set exhaustively by
handling a 100-solution approximation set.
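
The proportions quoted above follow from dividing the archive size by the size of the search space, $2^N$ with $N = 18$. A small check, with a hypothetical helper name:

```python
def archive_proportion(archive_size: int, n_bits: int) -> float:
    """Share of the search space {0,1}^n_bits that an archive can hold."""
    return archive_size / 2 ** n_bits


for size in (100, 200):
    print(size, archive_proportion(size, 18))
# 100 solutions: about 3.8e-4; 200 solutions: about 7.6e-4
```

Read on the log-scale plots, an instance whose proportion of efficient solutions lies above these thresholds cannot be stored entirely in an archive of the corresponding size.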

To summarize, when the number of objectives increases, and even more when
the objectives are in conflict, the size of the efficient set becomes very large, and
then tends to be intractable. In this case, it is not reasonable to expect to identify
the whole efficient set, and a limited-size approximation should be considered.
This first result shows the importance of designing a benchmark where the objective
correlation can be tuned precisely, even when $M > 2$. Such a property should
be taken into consideration for the development of metaheuristics when the
number of objectives becomes too large, and when there is a high anti-correlation
between objective functions. Special attention should be paid to
the size of the approximation set handled by the search approach.

4.2 Number of Supported Efficient Solutions 

Figure 3 shows the proportion of supported solutions in the search space
according to parameters $K$, $\rho$ and $M$ of $\rho$MNK-landscapes. Overall, this number
follows the size of the efficient set: the epistatic parameter $K$ has little
influence on it. When the objective space dimension increases or the objective
correlation decreases, the number of supported solutions gets higher. The
difference with the size of the efficient set becomes clearer in Figure 4, which gives
the proportion of supported solutions over the efficient set. This proportion is
nearly independent of the epistasis degree of the problem ($K$). However, when
the objective correlation increases, this proportion increases. For a high
objective correlation ($\rho = 0.9$), nearly all solutions become supported (all of them
are for some instances). The same observation can be made with the number
of objectives. The number of supported solutions increases with the cardinality
of the efficient set, but the former increases faster than the latter.

Putting this property in relation with the design of a metaheuristic,
we can conclude that scalar approaches should become more appropriate when
the number of objectives is low, and when the objective correlation is high.

Fig. 2. Average ratio of the number of efficient solutions compared to the size of
the search space ($2^N$) according to parameter $\rho$ (top left $M = 2$, right $M = 5$),
and according to parameter $K$ for different numbers of objectives (bottom left
$\rho = -0.2$, right $\rho = 0.9$). Notice the log y-scale.



4.3 Connectedness of the Efficient Set 

In this section, the efficient graph (see Section 2.2), i.e. the graph of efficient
solutions where edges are induced by a given neighborhood operator, is analyzed.

Firstly, the efficient graph can be composed of several connected components.
In this case, not all the efficient solutions are connected with respect to the
neighborhood relation. Figure 5 shows the average relative size of the largest connected
component induced by Hamming distance 1. Nearly all solutions of the
efficient graph are in the same component when the objective space dimension is
high ($M = 5$) and when the objective correlation is negative ($\rho = -0.2$). At first
sight, such a result seems to be explained by the very large size of the efficient
set obtained for those parameters (see Section 4.1). However, we compared this
result to the size of the largest component of a graph of the same size, but where the
nodes are now random solutions. We found that this size is much smaller
than that of the efficient graph, in particular when the epistatic degree is low
(the largest component of the efficient graph is 170 times larger for $M = 5$,
$\rho = -0.2$, and $K = 4$). Consequently, the relative
size of the largest component is not a consequence of the number of efficient
solutions only.

Contrary to the size of the efficient set, the size of the largest connected
component seems to depend on the epistatic degree $K$. Indeed, this size decreases
when $K$ increases. As an example, for $M = 2$ and $\rho = -0.4$, the relative size is
0.42 for $K = 2$ and lower than 0.1 for $K = 10$. When the epistatic degree is low,




Fig. 3. Average ratio of the number of supported efficient solutions compared
to the size of the search space ($2^N$) according to parameter $\rho$ (top left $M = 2$,
right $M = 5$), and according to parameter $K$ for different numbers of objectives
(bottom left $\rho = -0.2$, right $\rho = 0.9$). Notice the log y-scale.



the objective values of neighboring solutions are correlated, and this correlation
decreases with the epistatic degree [13]. This could explain our experimental
result: if a solution is efficient, the probability that one of its neighbors is also
efficient gets higher when the epistatic degree gets lower.

The objective correlation and the number of objective functions also affect
the size of the largest connected component. But the variation is different with
respect to the number of objective functions. For $M = 2$, the relative size of the largest
component increases when the objective correlation increases (apart from
$K = 2$). For $M = 5$, the ratio decreases when the objective correlation increases.
As a consequence, except when the efficient set is intractable (that is, when
there is a high objective space dimension and a high anti-correlation degree),
we cannot expect to reach all the efficient solutions by iteratively exploring
the neighborhood of an approximation set initialized with one non-dominated
solution. However, when there are several connected components for the efficient
graph based on Hamming distance 1 (see the definition of cluster in Section 2.2),
the distance between those components could be small.

When efficient solutions are connected with respect to a neighborhood
structure related to Hamming distance $k$ and not $k - 1$, the efficient set is then said
to be $k$-connected [10]. When the minimal distance $k$ is around 9, which is the
average distance between random solutions, we can say that the distance
between efficient solutions is large. Figure 6 shows the average minimal distance $k$



Fig. 4. Average ratio of the number of supported efficient solutions compared to
the size of the efficient set according to parameter $\rho$ (top left $M = 2$, right $M = 5$),
and according to parameter $K$ for different numbers of objectives (bottom left
$\rho = -0.2$, right $\rho = 0.9$). Notice the log y-scale.



to connect all the efficient solutions. This minimal distance $k$ increases when the
epistatic degree increases. As an example, for $\rho = -0.2$, the average distance is
equal to 4.3 and 2 for dimensions 2 and 5, respectively, when $K = 2$, whereas it
is equal to 7.1 and 2.8, respectively, when $K = 10$. These results agree with the
previous ones on the largest component size: as the size of the largest
component decreases, the distance between efficient solutions increases.
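
The minimal distance $k$ can be computed without testing every value of $k$: it equals the largest edge of a minimum spanning tree of the efficient solutions under the Hamming distance. The sketch below uses Prim's algorithm; it is one possible way to obtain the measure, not necessarily the procedure used for the experiments, and it assumes distinct bit strings of equal length.

```python
from typing import List


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def minimal_connecting_distance(solutions: List[str]) -> int:
    """Smallest k such that linking solutions at distance <= k connects them all."""
    if len(solutions) < 2:
        return 0
    in_tree = {0}
    # best[i]: distance from solution i to the closest solution in the tree.
    best = [hamming(solutions[0], s) for s in solutions]
    k = 0
    while len(in_tree) < len(solutions):
        nxt = min((i for i in range(len(solutions)) if i not in in_tree),
                  key=best.__getitem__)
        k = max(k, best[nxt])  # largest edge added to the spanning tree so far
        in_tree.add(nxt)
        for i, s in enumerate(solutions):
            if i not in in_tree:
                best[i] = min(best[i], hamming(solutions[nxt], s))
    return k


# Hypothetical efficient set on 4 bits.
print(minimal_connecting_distance(["0000", "0001", "0011", "1100"]))  # 2
```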

The average minimal distance $k$ also increases when the objective correlation
increases. For an objective space dimension of 5 and a negative objective correlation
$\rho = -0.2$, it could be possible to reach all non-dominated solutions from another
one, as the average minimal distance is lower than 3. In contrast, when the
objective correlation is positive, it should be easier to find a new non-dominated
solution by restarting the search from a random solution rather than by exploring
the neighborhood of a given non-dominated solution, since the distance is then
around one third of the bit string length. When objectives are correlated, fewer
solutions are to be found, but knowing some of them will not help to find more.
Hence, the design of an efficient metaheuristic has to be different according to the
objective correlation. In a two-phase approach, the number of starting solutions
and the size of the neighborhood can be tuned according to the correlation between
objectives following this study.



Fig. 5. Average ratio of the size of the largest connected component of the efficient
graph (Hamming distance of 1) to the size of the efficient set according to parameter
$\rho$ (top left $M = 2$, right $M = 5$), and according to parameter $K$ for different
numbers of objectives (bottom left $\rho = -0.2$, right $\rho = 0.9$).



5 Discussion 

In this paper, we analyzed the consequence of the objective space dimension,
the non-linearity, and the objective correlation on the structure of multiobjective
combinatorial search spaces for the design of metaheuristics. We proposed
a new method to design a multiobjective combinatorial benchmark where the
correlation between all pairs of objectives can be tuned very precisely. As an
example, we defined the $\rho$MNK-landscapes, which extend the multiobjective
NK-landscapes.

Figure 7 shows three examples of $\rho$MNK-landscapes in the objective space.
The number of objectives is 2, the parameter $K$ is 4, and the length of the bit string
is 18. This gives a summary of our results in a more intuitive way. When the
objective correlation is negative, the objectives are in conflict (feasible solutions
are in green). The efficient set size (in red) is large, and the problem could
become intractable. In this case, a metaheuristic has to find a limited-size
approximation of the efficient set only. When the objective correlation is null, as
in [5], the image of the search space in the objective space can be represented
as a multidimensional 'bowl'. The objectives are independent. When the
objective correlation is positive, there exist few solutions in the efficient set. Nearly
all solutions become supported. Indeed, when the number of objectives is low,
and when the objective correlation is high, efficient solutions are supported. We
can conclude that scalar approaches should become more appropriate in such a



Fig. 6. Average of the minimal Hamming distance to connect all the efficient
solutions according to parameter $\rho$ (top left $M = 2$, right $M = 5$), and according
to parameter $K$ for different numbers of objectives (bottom left $\rho = -0.2$, right
$\rho = 0.9$).



case. The connectedness property is not represented in the last figure. The size of
the largest connected component and the minimal distance to connect all the efficient
solutions depend on the objective space dimension, the epistatic degree, and also
on the objective correlation. A two-phase strategy, starting from some efficient
(supported) solutions, and exploring their neighborhood at a given distance, can
be tuned according to the results of this work.

Relating those properties to the design of local search metaheuristics helps
to make proper choices between several classes of methodologies. This analysis
shows the importance of the objective correlation in the design of benchmark
problems, in particular when the number of objectives is higher than 2. In future
work, we will use sampling techniques to study $\rho$MNK-landscapes of larger
size. We will also compare our results on the properties of the search space with the
performance of different metaheuristics. However, the efficient set does not cover
all the search space properties, so future work will also focus on the properties related
to the Pareto local optima, and to the Pareto local optimum sets.